Implementing Offline Speech Synthesis on Android with the Built-in TextToSpeech



2023-12-18 00:45 | Source: compiled from the web | Views: 265

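Scenario

An Android app needs to synthesize speech from a piece of text and play it aloud.

This can work offline, without a network connection and without third-party speech synthesis SDKs or tools such as iFlytek or Baidu, provided the device has a text-to-speech engine with voice data for the target language installed.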

Note:

Blog: https://blog.csdn.net/badao_liumang_qizhi Follow the WeChat official account 霸道的程序猿 for programming e-books, tutorials, and free downloads.

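Implementation: TextToSpeech

TextToSpeech converts a piece of text into speech.

TextToSpeech is a class that ships with the Android framework (android.speech.tts), so no additional library needs to be added.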

Result

Download:

https://files.cnblogs.com/files/badaoliumangqizhi/%E8%AF%AD%E9%9F%B3%E6%92%AD%E6%8A%A5.rar

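Page implementation

To build a test demo, first add a Plain Text (EditText) and a Button to a page under layout, arranged as follows.

Give both components an id attribute.

Then, in the onCreate method of the corresponding activity: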

// Look up the two views by the ids assigned in the layout.
Button button = (Button) findViewById(R.id.button);
EditText editText = (EditText) findViewById(R.id.editTextTextPersonName2);
button.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View v) {
        // Speak whatever is currently typed into the input field.
        SpeechUtils.getInstance(LoginActivity.this).speakText(editText.getText().toString());
    }
});

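This obtains the two views, reads the text of the Plain Text view, and sets the Button's click listener.

The click handler calls speakText, a method of the utility class SpeechUtils.

SpeechUtils wraps speech synthesis and playback for convenience: you only pass in the String to be spoken.

The utility class is designed as a singleton.

Create a package for utility classes somewhere in the project and add a SpeechUtils class with the following code: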

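import android.content.Context;
import android.speech.tts.TextToSpeech;
import android.util.Log;

import java.util.Locale;

public class SpeechUtils {
    private static final String TAG = "SpeechUtils";
    // volatile is required for double-checked locking to be safe.
    private static volatile SpeechUtils singleton;
    private TextToSpeech textToSpeech; // TTS object

    public static SpeechUtils getInstance(Context context) {
        if (singleton == null) {
            synchronized (SpeechUtils.class) {
                if (singleton == null) {
                    singleton = new SpeechUtils(context);
                }
            }
        }
        return singleton;
    }

    private SpeechUtils(Context context) {
        // Use the application context so the long-lived singleton does not leak an Activity.
        textToSpeech = new TextToSpeech(context.getApplicationContext(), new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int status) {
                if (status != TextToSpeech.SUCCESS) {
                    Log.e(TAG, "TextToSpeech initialization failed, status=" + status);
                    return;
                }
                // textToSpeech.setLanguage(Locale.US);
                int result = textToSpeech.setLanguage(Locale.CHINA);
                if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                    Log.w(TAG, "Chinese is not available on this engine, result=" + result);
                }
                // Pitch: higher values sound sharper (more female), lower values sound more male; 1.0 is normal.
                textToSpeech.setPitch(1.5f);
                // Speech rate: 1.0 is normal, 0.5 is half speed.
                textToSpeech.setSpeechRate(0.5f);
            }
        });
    }

    public void speakText(String text) {
        if (textToSpeech != null) {
            // Four-argument speak() (API level 21+) replaces the deprecated three-argument overload.
            textToSpeech.speak(text, TextToSpeech.QUEUE_FLUSH, null, TAG);
        }
    }
}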

Notes:

1. The code above is designed as a singleton, so it is called directly with

SpeechUtils.getInstance(LoginActivity.this).speakText(editText.getText().toString());

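where the first argument is a Context. Inside the anonymous OnClickListener, this refers to the listener rather than the Activity, so the call must use ActivityName.this; directly inside an Activity method, plain this also works.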

2. In the utility class, the import

import android.speech.tts.TextToSpeech;

shows that TextToSpeech comes directly from the android package. It is built in, and no third-party dependency is introduced.

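3. On language support: many older posts online claim that Chinese is not supported, which was probably true only of very old versions. The language can be changed in the following place,

where many languages are listed and Chinese is now supported. So you only need to set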

textToSpeech.setLanguage(Locale.CHINA);

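4. Other properties that are set

textToSpeech.setPitch(1.5f); // Pitch: higher values sound sharper (more female), lower values sound more male; 1.0 is normal
textToSpeech.setSpeechRate(0.5f); // Speech rate: 1.0 is normal, 0.5 is half speed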

5. For more properties and APIs, see the official Android documentation:

https://developer.android.google.cn/reference/android/speech/tts/TextToSpeech
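Regarding note 3, setLanguage returns a status code rather than throwing when a language is unavailable, so it can help to inspect that result. The following is a minimal sketch of such a check in plain Java. The class TtsLanguageResult and its methods are hypothetical helpers, not part of the Android SDK, and the constant values are copied from the TextToSpeech.LANG_* constants in the official reference.

```java
/** Hypothetical helper (not part of the Android SDK) for interpreting setLanguage() results. */
final class TtsLanguageResult {
    // Values mirror the TextToSpeech.LANG_* constants in the official reference.
    static final int LANG_COUNTRY_VAR_AVAILABLE = 2;
    static final int LANG_COUNTRY_AVAILABLE = 1;
    static final int LANG_AVAILABLE = 0;
    static final int LANG_MISSING_DATA = -1;
    static final int LANG_NOT_SUPPORTED = -2;

    private TtsLanguageResult() {}

    /** Returns true when the engine can speak the requested language at some level of match. */
    static boolean isUsable(int result) {
        return result == LANG_AVAILABLE
                || result == LANG_COUNTRY_AVAILABLE
                || result == LANG_COUNTRY_VAR_AVAILABLE;
    }

    /** Returns a short human-readable description of a LANG_* result code. */
    static String describe(int result) {
        switch (result) {
            case LANG_COUNTRY_VAR_AVAILABLE: return "exact locale match";
            case LANG_COUNTRY_AVAILABLE: return "language and country match";
            case LANG_AVAILABLE: return "language match only";
            case LANG_MISSING_DATA: return "language data missing";
            case LANG_NOT_SUPPORTED: return "language not supported";
            default: return "unknown result code " + result;
        }
    }
}
```

In an app, the value passed in would be the return value of textToSpeech.setLanguage(Locale.CHINA); when isUsable returns false, the app could log describe(result) or fall back to another locale.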

Excerpt from the official documentation:

TextToSpeech


public class TextToSpeech extends Object

java.lang.Object
   ↳ android.speech.tts.TextToSpeech

Synthesizes speech from text for immediate playback or to create a sound file.

A TextToSpeech instance can only be used to synthesize text once it has completed its initialization. Implement the TextToSpeech.OnInitListener to be notified of the completion of the initialization. When you are done using the TextToSpeech instance, call the shutdown() method to release the native resources used by the TextToSpeech engine. Apps targeting Android 11 that use text-to-speech should declare TextToSpeech.Engine#INTENT_ACTION_TTS_SERVICE in the queries elements of their manifest:

...

Summary

Nested classes

class TextToSpeech.Engine

Constants and parameter names for controlling text-to-speech.

class TextToSpeech.EngineInfo

Information about an installed text-to-speech engine.

interface TextToSpeech.OnInitListener

Interface definition of a callback to be invoked indicating the completion of the TextToSpeech engine initialization.

interface TextToSpeech.OnUtteranceCompletedListener

This interface was deprecated in API level 18. Use UtteranceProgressListener instead.

Constants

String ACTION_TTS_QUEUE_PROCESSING_COMPLETED

Broadcast Action: The TextToSpeech synthesizer has completed processing of all the text in the speech queue.

int ERROR

Denotes a generic operation failure.

int ERROR_INVALID_REQUEST

Denotes a failure caused by an invalid request.

int ERROR_NETWORK

Denotes a failure caused by network connectivity problems.

int ERROR_NETWORK_TIMEOUT

Denotes a failure caused by network timeout.

int ERROR_NOT_INSTALLED_YET

Denotes a failure caused by an unfinished download of the voice data.

int ERROR_OUTPUT

Denotes a failure related to the output (audio device or a file).

int ERROR_SERVICE

Denotes a failure of a TTS service.

int ERROR_SYNTHESIS

Denotes a failure of a TTS engine to synthesize the given input.

int LANG_AVAILABLE

Denotes the language is available for the language by the locale, but not the country and variant.

int LANG_COUNTRY_AVAILABLE

Denotes the language is available for the language and country specified by the locale, but not the variant.

int LANG_COUNTRY_VAR_AVAILABLE

Denotes the language is available exactly as specified by the locale.

int LANG_MISSING_DATA

Denotes the language data is missing.

int LANG_NOT_SUPPORTED

Denotes the language is not supported.

int QUEUE_ADD

Queue mode where the new entry is added at the end of the playback queue.

int QUEUE_FLUSH

Queue mode where all entries in the playback queue (media to be played and text to be synthesized) are dropped and replaced by the new entry.

int STOPPED

Denotes a stop requested by a client.

int SUCCESS

Denotes a successful operation.

Public constructors

TextToSpeech(Context context, TextToSpeech.OnInitListener listener)

The constructor for the TextToSpeech class, using the default TTS engine.

TextToSpeech(Context context, TextToSpeech.OnInitListener listener, String engine)

The constructor for the TextToSpeech class, using the given TTS engine.

Public methods

int addEarcon(String earcon, String packagename, int resourceId)

Adds a mapping between a string of text and a sound resource in a package.

int addEarcon(String earcon, String filename)

This method was deprecated in API level 21. As of API level 21, replaced by addEarcon(java.lang.String, java.io.File).

int addEarcon(String earcon, File file)

Adds a mapping between a string of text and a sound file.

int addSpeech(CharSequence text, File file)

Adds a mapping between a CharSequence (may be spanned with TtsSpans) and a sound file.

int addSpeech(String text, String packagename, int resourceId)

Adds a mapping between a string of text and a sound resource in a package.

int addSpeech(CharSequence text, String packagename, int resourceId)

Adds a mapping between a CharSequence (may be spanned with TtsSpans) of text and a sound resource in a package.

int addSpeech(String text, String filename)

Adds a mapping between a string of text and a sound file.

boolean areDefaultsEnforced()

Checks whether the user's settings should override settings requested by the calling application.

Set<Locale> getAvailableLanguages()

Query the engine about the set of available languages.

String getDefaultEngine()

Gets the package name of the default speech synthesis engine.

Locale getDefaultLanguage()

This method was deprecated in API level 21. As of API level 21, use getDefaultVoice().getLocale() (getDefaultVoice())

Voice getDefaultVoice()

Returns a Voice instance that's the default voice for the default Text-to-speech language.

List<TextToSpeech.EngineInfo> getEngines()

Gets a list of all installed TTS engines.

Set<String> getFeatures(Locale locale)

This method was deprecated in API level 21. As of API level 21, please use voices. In order to query features of the voice, call getVoices() to retrieve the list of available voices and Voice#getFeatures() to retrieve the set of features.

Locale getLanguage()

This method was deprecated in API level 21. As of API level 21, please use getVoice().getLocale() (getVoice()).

static int getMaxSpeechInputLength()

Limit of length of input string passed to speak and synthesizeToFile.

Voice getVoice()

Returns a Voice instance describing the voice currently being used for synthesis requests sent to the TextToSpeech engine.

Set<Voice> getVoices()

Query the engine about the set of available voices.

int isLanguageAvailable(Locale loc)

Checks if the specified language as represented by the Locale is available and supported.

boolean isSpeaking()

Checks whether the TTS engine is busy speaking.

int playEarcon(String earcon, int queueMode, HashMap<String, String> params)

This method was deprecated in API level 21. As of API level 21, replaced by playEarcon(java.lang.String, int, android.os.Bundle, java.lang.String).

int playEarcon(String earcon, int queueMode, Bundle params, String utteranceId)

Plays the earcon using the specified queueing mode and parameters.

int playSilence(long durationInMs, int queueMode, HashMap<String, String> params)

This method was deprecated in API level 21. As of API level 21, replaced by playSilentUtterance(long, int, java.lang.String).

int playSilentUtterance(long durationInMs, int queueMode, String utteranceId)

Plays silence for the specified amount of time using the specified queue mode.

int setAudioAttributes(AudioAttributes audioAttributes)

Sets the audio attributes to be used when speaking text or playing back a file.

int setEngineByPackageName(String enginePackageName)

This method was deprecated in API level 15. This doesn't inform callers when the TTS engine has been initialized. TextToSpeech(android.content.Context, android.speech.tts.TextToSpeech.OnInitListener, java.lang.String) can be used with the appropriate engine name. Also, there is no guarantee that the engine specified will be loaded. If it isn't installed or disabled, the user / system wide defaults will apply.

int setLanguage(Locale loc)

Sets the text-to-speech language.

int setOnUtteranceCompletedListener(TextToSpeech.OnUtteranceCompletedListener listener)

This method was deprecated in API level 15. Use setOnUtteranceProgressListener(android.speech.tts.UtteranceProgressListener) instead.

int setOnUtteranceProgressListener(UtteranceProgressListener listener)

Sets the listener that will be notified of various events related to the synthesis of a given utterance.

int setPitch(float pitch)

Sets the speech pitch for the TextToSpeech engine.

int setSpeechRate(float speechRate)

Sets the speech rate.

int setVoice(Voice voice)

Sets the text-to-speech voice.

void shutdown()

Releases the resources used by the TextToSpeech engine.

int speak(CharSequence text, int queueMode, Bundle params, String utteranceId)

Speaks the text using the specified queuing strategy and speech parameters, the text may be spanned with TtsSpans.

int speak(String text, int queueMode, HashMap<String, String> params)

This method was deprecated in API level 21. As of API level 21, replaced by speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String).

int stop()

Interrupts the current utterance (whether played or rendered to file) and discards other utterances in the queue.

int synthesizeToFile(CharSequence text, Bundle params, ParcelFileDescriptor fileDescriptor, String utteranceId)

Synthesizes the given text to a ParcelFileDescriptor using the specified parameters.

int synthesizeToFile(CharSequence text, Bundle params, File file, String utteranceId)

Synthesizes the given text to a file using the specified parameters.

int synthesizeToFile(String text, HashMap<String, String> params, String filename)

This method was deprecated in API level 21. As of API level 21, replaced by synthesizeToFile(java.lang.CharSequence, android.os.Bundle, java.io.File, java.lang.String).
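Because getMaxSpeechInputLength() puts a limit on the length of the string passed to speak and synthesizeToFile, longer texts have to be split before they are queued. Here is a minimal sketch of one way to do that in plain Java. SpeechChunker is a hypothetical helper, and maxLength is assumed to be supplied by the caller (for example, the value returned by getMaxSpeechInputLength()); the sketch cuts at fixed offsets and does not look for sentence boundaries.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits long text into pieces no longer than a given limit. */
final class SpeechChunker {
    private SpeechChunker() {}

    /** Splits text into consecutive chunks of at most maxLength chars each. */
    static List<String> split(String text, int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + maxLength, text.length());
            // Avoid cutting a surrogate pair in half when the chunk has room to shrink.
            if (end < text.length()
                    && end - start > 1
                    && Character.isHighSurrogate(text.charAt(end - 1))) {
                end--;
            }
            chunks.add(text.substring(start, end));
            start = end;
        }
        return chunks;
    }
}
```

A caller could then speak the first chunk with QUEUE_FLUSH and the remaining chunks with QUEUE_ADD so that they play back in order.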

Inherited methods

From class java.lang.Object

Constants

ACTION_TTS_QUEUE_PROCESSING_COMPLETED

Added in API level 4

public static final String ACTION_TTS_QUEUE_PROCESSING_COMPLETED

Broadcast Action: The TextToSpeech synthesizer has completed processing of all the text in the speech queue. Note that this notifies callers when the engine has finished processing text data. Audio playback might not have completed (or even started) at this point. If you wish to be notified when this happens, see OnUtteranceCompletedListener.

Constant Value: "android.speech.tts.TTS_QUEUE_PROCESSING_COMPLETED"

ERROR

Added in API level 4

public static final int ERROR

Denotes a generic operation failure.

Constant Value: -1 (0xffffffff)

ERROR_INVALID_REQUEST

Added in API level 21

public static final int ERROR_INVALID_REQUEST

Denotes a failure caused by an invalid request.

Constant Value: -8 (0xfffffff8)

ERROR_NETWORK

Added in API level 21

public static final int ERROR_NETWORK

Denotes a failure caused by network connectivity problems.

Constant Value: -6 (0xfffffffa)

ERROR_NETWORK_TIMEOUT

Added in API level 21

public static final int ERROR_NETWORK_TIMEOUT

Denotes a failure caused by network timeout.

Constant Value: -7 (0xfffffff9)

ERROR_NOT_INSTALLED_YET

Added in API level 21

public static final int ERROR_NOT_INSTALLED_YET

Denotes a failure caused by an unfinished download of the voice data.

See also:

TextToSpeech.Engine.KEY_FEATURE_NOT_INSTALLED

Constant Value: -9 (0xfffffff7)

ERROR_OUTPUT

Added in API level 21

public static final int ERROR_OUTPUT

Denotes a failure related to the output (audio device or a file).

Constant Value: -5 (0xfffffffb)

ERROR_SERVICE

Added in API level 21

public static final int ERROR_SERVICE

Denotes a failure of a TTS service.

Constant Value: -4 (0xfffffffc)

ERROR_SYNTHESIS

Added in API level 21

public static final int ERROR_SYNTHESIS

Denotes a failure of a TTS engine to synthesize the given input.

Constant Value: -3 (0xfffffffd)
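The numeric values listed for the operation result constants can be hard to read in logs, for example when an error code is delivered to an UtteranceProgressListener. As a rough illustration, the plain-Java sketch below maps the documented values back to their constant names. TtsErrorNames is a hypothetical helper, not an SDK class. It covers operation results only: LANG_MISSING_DATA and LANG_NOT_SUPPORTED reuse the values -1 and -2 but belong to the language-availability results.

```java
/** Hypothetical helper that maps TextToSpeech operation result codes to their constant names. */
final class TtsErrorNames {
    private TtsErrorNames() {}

    /** Returns the constant name for a documented result code, or a fallback string. */
    static String nameOf(int code) {
        switch (code) {
            case 0: return "SUCCESS";
            case -1: return "ERROR";
            case -2: return "STOPPED";
            case -3: return "ERROR_SYNTHESIS";
            case -4: return "ERROR_SERVICE";
            case -5: return "ERROR_OUTPUT";
            case -6: return "ERROR_NETWORK";
            case -7: return "ERROR_NETWORK_TIMEOUT";
            case -8: return "ERROR_INVALID_REQUEST";
            case -9: return "ERROR_NOT_INSTALLED_YET";
            default: return "UNKNOWN(" + code + ")";
        }
    }
}
```

In real Android code it is preferable to compare against the TextToSpeech constants directly; the literal values appear here only so that the sketch compiles without the Android SDK.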

LANG_AVAILABLE

Added in API level 4

public static final int LANG_AVAILABLE

Denotes the language is available for the language by the locale, but not the country and variant.

Constant Value: 0 (0x00000000)

LANG_COUNTRY_AVAILABLE

Added in API level 4

public static final int LANG_COUNTRY_AVAILABLE

Denotes the language is available for the language and country specified by the locale, but not the variant.

Constant Value: 1 (0x00000001)

LANG_COUNTRY_VAR_AVAILABLE

Added in API level 4

public static final int LANG_COUNTRY_VAR_AVAILABLE

Denotes the language is available exactly as specified by the locale.

Constant Value: 2 (0x00000002)

LANG_MISSING_DATA

Added in API level 4

public static final int LANG_MISSING_DATA

Denotes the language data is missing.

Constant Value: -1 (0xffffffff)

LANG_NOT_SUPPORTED

Added in API level 4

public static final int LANG_NOT_SUPPORTED

Denotes the language is not supported.

Constant Value: -2 (0xfffffffe)

QUEUE_ADD

Added in API level 4

public static final int QUEUE_ADD

Queue mode where the new entry is added at the end of the playback queue.

Constant Value: 1 (0x00000001)

QUEUE_FLUSH

Added in API level 4

public static final int QUEUE_FLUSH

Queue mode where all entries in the playback queue (media to be played and text to be synthesized) are dropped and replaced by the new entry. Queues are flushed with respect to a given calling app. Entries in the queue from other callees are not discarded.

Constant Value: 0 (0x00000000)
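The difference between QUEUE_ADD and QUEUE_FLUSH is easiest to see on a toy queue. The sketch below is a simplified plain-Java model of the behaviour described above, not the engine's actual implementation. QueueModeModel is a hypothetical class, and it models a single calling app only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of the two queue modes; not how the TTS engine is implemented. */
final class QueueModeModel {
    // Values mirror TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.
    static final int QUEUE_FLUSH = 0;
    static final int QUEUE_ADD = 1;

    private final Deque<String> pending = new ArrayDeque<>();

    /** Adds an entry, dropping everything already queued when the mode is QUEUE_FLUSH. */
    void enqueue(String text, int queueMode) {
        if (queueMode == QUEUE_FLUSH) {
            pending.clear();
        }
        pending.addLast(text);
    }

    /** Returns the pending entries in playback order. */
    List<String> snapshot() {
        return new ArrayList<>(pending);
    }
}
```

Enqueuing "a" and "b" with QUEUE_ADD leaves both pending in that order; a following "c" with QUEUE_FLUSH leaves only "c". This is why speakText in the article, which always uses QUEUE_FLUSH, interrupts whatever was queued before.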

STOPPED

Added in API level 21

public static final int STOPPED

Denotes a stop requested by a client. It's used only on the service side of the API, client should never expect to see this result code.

Constant Value: -2 (0xfffffffe)

SUCCESS

Added in API level 4

public static final int SUCCESS

Denotes a successful operation.

Constant Value: 0 (0x00000000)

Public constructors

TextToSpeech

Added in API level 4

public TextToSpeech (Context context, TextToSpeech.OnInitListener listener)

The constructor for the TextToSpeech class, using the default TTS engine. This will also initialize the associated TextToSpeech engine if it isn't already running.

Parameters

context Context: The context this instance is running in.

listener TextToSpeech.OnInitListener: The TextToSpeech.OnInitListener that will be called when the TextToSpeech engine has initialized. In a case of a failure the listener may be called immediately, before TextToSpeech instance is fully constructed.

TextToSpeech

Added in API level 14

public TextToSpeech (Context context, TextToSpeech.OnInitListener listener, String engine)

The constructor for the TextToSpeech class, using the given TTS engine. This will also initialize the associated TextToSpeech engine if it isn't already running.

Parameters

context Context: The context this instance is running in.

engine String: Package name of the TTS engine to use.

Public methods

addEarcon

Added in API level 4

public int addEarcon (String earcon, String packagename, int resourceId)

Adds a mapping between a string of text and a sound resource in a package. Use this to add custom earcons.

Parameters

earcon String: The name of the earcon. Example: "[tick]"

packagename String: the package name of the application that contains the resource. This can for instance be the package name of your own application. Example: "com.google.marvin.compass" The package name can be found in the AndroidManifest.xml of the application containing the resource.

resourceId int: Example: R.raw.tick_snd

Returns

int: Code indicating success or failure. See ERROR and SUCCESS.

See also:

playEarcon(String, int, HashMap)

addEarcon

Added in API level 4 Deprecated in API level 21

public int addEarcon (String earcon, String filename)

This method was deprecated in API level 21. As of API level 21, replaced by addEarcon(java.lang.String, java.io.File).

Adds a mapping between a string of text and a sound file. Use this to add custom earcons.

Parameters

earcon String: The name of the earcon. Example: "[tick]"

filename String: The full path to the sound file (for example: "/sdcard/mysounds/tick.wav")

Returns

int: Code indicating success or failure. See ERROR and SUCCESS.

See also:

playEarcon(String, int, HashMap)

addEarcon

Added in API level 21

public int addEarcon (String earcon, File file)

Adds a mapping between a string of text and a sound file. Use this to add custom earcons.

Parameters

earcon String: The name of the earcon. Example: "[tick]"

file File: File object pointing to the sound file.

Returns

int: Code indicating success or failure. See ERROR and SUCCESS.

See also:

playEarcon(String, int, HashMap)

addSpeech

Added in API level 21

public int addSpeech (CharSequence text, File file)

Adds a mapping between a CharSequence (may be spanned with TtsSpans) and a sound file. Using this, it is possible to add custom pronunciations for a string of text. After a call to this method, subsequent calls to speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String) will play the specified sound resource if it is available, or synthesize the text it is missing.

Parameters

text CharSequence: The string of text. Example: "south_south_east"

file File: File object pointing to the sound file.

Returns

int: Code indicating success or failure. See ERROR and SUCCESS.

addSpeech

Added in API level 4

public int addSpeech (String text, String packagename, int resourceId)

Adds a mapping between a string of text and a sound resource in a package. After a call to this method, subsequent calls to speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String) will play the specified sound resource if it is available, or synthesize the text it is missing.

Parameters

text String: The string of text. Example: "south_south_east"

packagename String: Pass the packagename of the application that contains the resource. If the resource is in your own application (this is the most common case), then put the packagename of your application here. Example: "com.google.marvin.compass" The packagename can be found in the AndroidManifest.xml of your application.

resourceId int: Example: R.raw.south_south_east

Returns

int: Code indicating success or failure. See ERROR and SUCCESS.

addSpeech

Added in API level 21

public int addSpeech (CharSequence text, String packagename, int resourceId)

Adds a mapping between a CharSequence (may be spanned with TtsSpans) of text and a sound resource in a package. After a call to this method, subsequent calls to speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String) will play the specified sound resource if it is available, or synthesize the text it is missing.

Parameters

text CharSequence: The string of text. Example: "south_south_east"

resourceId int: Example: R.raw.south_south_east

Returns

int: Code indicating success or failure. See ERROR and SUCCESS.

addSpeech

Added in API level 4

public int addSpeech (String text, String filename)

Adds a mapping between a string of text and a sound file. Using this, it is possible to add custom pronunciations for a string of text. After a call to this method, subsequent calls to speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String) will play the specified sound resource if it is available, or synthesize the text it is missing.

Parameters

text String: The string of text. Example: "south_south_east"

filename String: The full path to the sound file (for example: "/sdcard/mysounds/hello.wav")

Returns

int: Code indicating success or failure. See ERROR and SUCCESS.

areDefaultsEnforced

Added in API level 8 Deprecated in API level 21

public boolean areDefaultsEnforced ()

Checks whether the user's settings should override settings requested by the calling application. As of the Ice cream sandwich release, user settings never forcibly override the app's settings.

Returns

boolean

getAvailableLanguages

Added in API level 21

public Set<Locale> getAvailableLanguages ()

Query the engine about the set of available languages.

Returns

Set<Locale>

getDefaultEngine

Added in API level 8

public String getDefaultEngine ()

Gets the package name of the default speech synthesis engine.

Returns

String: Package name of the TTS engine that the user has chosen as their default.

getDefaultLanguage

Added in API level 18 Deprecated in API level 21

public Locale getDefaultLanguage ()

This method was deprecated in API level 21. As of API level 21, use getDefaultVoice().getLocale() (getDefaultVoice())

Returns a Locale instance describing the language currently being used as the default Text-to-speech language. The locale object returned by this method is NOT a valid one. It has identical form to the one in getLanguage(). Please refer to getLanguage() for more information.

Returns

Locale: language, country (if any) and variant (if any) used by the client stored in a Locale instance, or null on error.

getDefaultVoice

Added in API level 21

public Voice getDefaultVoice ()

Returns a Voice instance that's the default voice for the default Text-to-speech language.

Returns

Voice: The default voice instance for the default language, or null if not set or on error.

getEngines

Added in API level 14

public List<TextToSpeech.EngineInfo> getEngines ()

Gets a list of all installed TTS engines.

Returns

List<TextToSpeech.EngineInfo>: A list of engine info objects. The list can be empty, but never null.

getFeatures

Added in API level 15 Deprecated in API level 21

public Set<String> getFeatures (Locale locale)

Queries the engine for the set of features it supports for a given locale. Features can either be framework defined, e.g. TextToSpeech.Engine#KEY_FEATURE_NETWORK_SYNTHESIS or engine specific. Engine specific keys must be prefixed by the name of the engine they are intended for. These keys can be used as parameters to TextToSpeech#speak(String, int, java.util.HashMap) and TextToSpeech#synthesizeToFile(String, java.util.HashMap, String). Features values are strings and their values must meet restrictions described in their documentation.

Parameters

locale Locale: The locale to query features for.

Returns

Set<String>: Set instance. May return null on error.

getLanguage

Added in API level 4 Deprecated in API level 21

public Locale getLanguage ()

This method was deprecated in API level 21. As of API level 21, please use getVoice().getLocale() (getVoice()).

Returns a Locale instance describing the language currently being used for synthesis requests sent to the TextToSpeech engine. In Android 4.2 and before (API
